|
In computing, the term data warehouse appliance (DWA) was coined by Foster Hinshaw〔(Infostor » Introducing 'data warehouse appliances')〕〔(TDWI » Still Another Data Warehouse Appliance Is Coming!)〕 to define a new category of computer architecture for data warehousing (DW), targeted specifically at Big Data analytics and discovery, that is (a) simple to use (rather than a pre-configured assembly of components) and (b) very high performance for this workload. A DWA includes an integrated set of servers, storage, operating system(s), and DBMS.

In marketing, the term has evolved to include pre-installed and pre-optimized hardware and software, as well as similar software-only systems〔(Queries From Hell blog » When is an appliance not an appliance?)〕 promoted as easy to install on specific recommended hardware configurations or preconfigured as a complete system.〔(DBMS2 - DataBase Management System Services » Blog Archive » Data warehouse appliances – fact and fiction)〕〔Omer Trajman, Alain Crolotte, David Steinhoff, Raghunath Nambiar, Meikel Poess: (Database Are Not Toasters: A Framework for Comparing Data Warehouse Appliances)〕 These are marketing uses of the term and do not reflect the technical definition.

At its core, a DWA is designed specifically for high-performance big data analytics and is delivered as an easy-to-use packaged solution. The internal software (and often hardware) constructs of a DWA differ significantly from those of a traditional stack in that they are written for a target workload rather than a general-purpose one. DW appliances are marketed for medium-to-large data applications, most commonly on data volumes in the terabyte to petabyte range.

== Technology ==

The data warehouse appliance (DWA) has several characteristics that differentiate its architecture from similar machines in a data center, such as an enterprise data warehouse (EDW).

1. A DWA has very tight integration of its internal components, which are optimized for "data-centric" operations in contrast to "compute-centric" operations. The latter tend to emphasize the number of CPUs, cores, and network bandwidth.
2. A DWA is trivial to install and use. In contrast to a "pre-configuration" of components, a DWA has very few configuration switches or options. Eliminating such options significantly reduces configuration error, the number one cause of failure in large systems.
3. A DWA is optimized for analytics on Big Data. In contrast, preceding architectures (including parallel ones) treated the "enterprise data warehouse" as a general-purpose repository for data and supported analytics as an ancillary task.
4. A DWA has high performance for analytics on Big Data. Its price-performance is usually 10 times, and often 50 times, that of earlier architectures such as the EDW.

Most DW appliances use massively parallel processing (MPP) architectures to provide high query performance and platform scalability. MPP architectures consist of independent processors or servers executing in parallel. Most MPP architectures implement a "shared-nothing architecture", in which each server operates self-sufficiently and controls its own memory and disk. DW appliances distribute data onto dedicated disk storage units connected to each server in the appliance. This distribution allows a DW appliance to resolve a relational query by scanning the data on each server in parallel. This divide-and-conquer approach delivers high performance and scales linearly as new servers are added to the architecture.

Excerpt source: Wikipedia, the free encyclopedia, "Data warehouse appliance".
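The shared-nothing, scan-in-parallel idea described above can be illustrated with a minimal Python sketch. It is not taken from any particular appliance: the function names (`distribute`, `local_scan`, `parallel_query`), the `order_id` and `amount` columns, and the hash-based placement are all assumptions made for illustration, and threads merely stand in for independent servers.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, num_nodes, key):
    """Hash-distribute rows so each (hypothetical) node owns a disjoint slice."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[hash(row[key]) % num_nodes].append(row)
    return nodes


def local_scan(partition, min_amount):
    """Runs on one node: scan only local rows and return a partial aggregate."""
    total = 0
    count = 0
    for row in partition:
        if row["amount"] >= min_amount:
            total += row["amount"]
            count += 1
    return total, count


def parallel_query(nodes, min_amount):
    """Coordinator: fan the scan out to every node, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda p: local_scan(p, min_amount), nodes))
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total, count


# Illustrative table: 100 orders with amounts 10, 20, ..., 1000.
rows = [{"order_id": i, "amount": i * 10} for i in range(1, 101)]
nodes = distribute(rows, num_nodes=4, key="order_id")

# Roughly: SELECT SUM(amount), COUNT(*) FROM orders WHERE amount >= 500
total, count = parallel_query(nodes, min_amount=500)
print(total, count)
```

Each node touches only its own partition, and the coordinator combines small partial results instead of moving rows between nodes. In this sketch, adding a node shrinks every partition, which is the intuition behind the scaling behaviour described above; a real appliance would also have to deal with data skew, joins across partitions, and node failure, none of which is modelled here.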
|